Can we trust the bootstrap in high-dimension?

Authors

  • Noureddine El Karoui
  • Elizabeth Purdom
Abstract

We consider the performance of the bootstrap in high dimensions for the setting of linear regression, where p < n but p/n is not close to zero. We consider ordinary least squares as well as robust regression methods and adopt a minimalist performance requirement: can the bootstrap give us good confidence intervals for a single coordinate of β (where β is the true regression vector)? We show through a mix of numerical and theoretical work that the bootstrap is fraught with problems. Both of the most commonly used methods of bootstrapping for regression, the residual bootstrap and the pairs bootstrap, give very poor inference on β as the ratio p/n grows. We find that the residual bootstrap tends to give anti-conservative estimates (inflated Type I error), while the pairs bootstrap gives very conservative estimates (severe loss of power) as the ratio p/n grows. We also show that the jackknife resampling technique for estimating the variance of β̂ severely overestimates the variance in high dimensions. We contribute alternative bootstrap procedures based on our theoretical results that mitigate these problems. However, the corrections depend on assumptions regarding the underlying data-generation model, suggesting that in high dimensions it may be difficult to have universal, robust bootstrapping techniques.
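As a rough illustration of the two resampling schemes named in the abstract, the following Python sketch builds percentile confidence intervals for one coordinate of β under ordinary least squares. It is not the authors' code or their corrected procedure: the synthetic Gaussian design, the choice n = 200 and p = 20, the centering of residuals, the percentile interval, and all function names are assumptions made only for this example.

```python
import numpy as np


def ols(X, y):
    # Ordinary least-squares coefficients.
    return np.linalg.lstsq(X, y, rcond=None)[0]


def pairs_bootstrap(X, y, n_boot, rng):
    # Pairs bootstrap: resample rows (x_i, y_i) jointly with replacement.
    n, p = X.shape
    draws = np.empty((n_boot, p))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        draws[b] = ols(X[idx], y[idx])
    return draws


def residual_bootstrap(X, y, n_boot, rng):
    # Residual bootstrap: keep X fixed, resample fitted residuals.
    n, p = X.shape
    fitted = X @ ols(X, y)
    resid = y - fitted
    resid = resid - resid.mean()  # assumption: center residuals first
    draws = np.empty((n_boot, p))
    for b in range(n_boot):
        y_star = fitted + rng.choice(resid, size=n, replace=True)
        draws[b] = ols(X, y_star)
    return draws


def percentile_ci(draws, level=0.95):
    # Simple percentile interval from bootstrap draws of one coordinate.
    alpha = 1.0 - level
    lo, hi = np.quantile(draws, [alpha / 2.0, 1.0 - alpha / 2.0])
    return float(lo), float(hi)


# Hypothetical synthetic setup: Gaussian design, true beta = 0, p/n = 0.1.
rng = np.random.default_rng(0)
n, p, n_boot = 200, 20, 200
X = rng.standard_normal((n, p))
y = X @ np.zeros(p) + rng.standard_normal(n)

pairs_draws = pairs_bootstrap(X, y, n_boot, rng)
resid_draws = residual_bootstrap(X, y, n_boot, rng)

pairs_ci = percentile_ci(pairs_draws[:, 0])
resid_ci = percentile_ci(resid_draws[:, 0])
print("pairs bootstrap CI for beta_1:   ", pairs_ci)
print("residual bootstrap CI for beta_1:", resid_ci)
```

Repeating this over many simulated data sets while increasing p toward n would be one way to compare the coverage of the two intervals, which is the kind of behavior the abstract describes.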

Similar resources

Explaining the Relationship of Social trust on the Citizenship Ethics of, High School Students in Bushehr City

The aim of this study is to explain the relationship between social trust and citizenship ethics. The study population consisted of all high school students in Bushehr city; from this population, a sample of 360 students was selected by multistage random sampling. The research tools included multi dimension comparison of recognized social support, questionnaire of social trust and ...

Functional Analysis of Iranian Temperature and Precipitation by Using Functional Principal Components Analysis

Extended Abstract. When data are in the form of continuous functions, they may challenge classical methods of data analysis based on arguments in finite dimensional spaces, and therefore need theoretical justification. Infinite dimensionality of spaces that data belong to, leads to major statistical methodologies and new insights for analyzing them, which is called functional data analysis (FDA...

Optimum Block Size in Separate Block Bootstrap to Estimate the Variance of Sample Mean for Lattice Data

The statistical analysis of spatial data is usually done under Gaussian assumption for the underlying random field model. When this assumption is not satisfied, block bootstrap methods can be used to analyze spatial data. One of the crucial problems in this setting is specifying the block sizes. In this paper, we present asymptotic optimal block size for separate block bootstrap to estimate the...

Investigating the Mutual Effects of Social Capital and Quality of Life in Urban Neighborhoods Using Structural Equation Modeling (Case Study: Sultan Mir-Ahmad and Fin in Kashan)

Urban neighborhoods are a suitable context to form sustainable social relationships and increase trust and social participation. With the development of cities and fundamental changes in the lifestyles of residents, we are faced with widespread changes and even disconnections in close relationships. For this, the concept of social capital in local communities can be effectively employed for imp...

Interpersonal Trust in Online Scientific Social Networks: Causes and Results

Background and Aim: This study investigates the causes of interpersonal trust and the results of trust in online scientific social networks. Methods: This applied research used cluster sampling to collect data. The study population consisted of Shiraz university and Persian Gulf university faculties. A sample of 269 people was determined using the Morgan table according to whole pop...

Publication date: 2015